The data I am using covers my favorite hockey team, the Los Angeles Kings. I downloaded it from NHL.com/stats. It spans the regular seasons from 2005-06 to 2017-18.
Let’s take a look at the structure of the data.
## 'data.frame': 1032 obs. of 23 variables:
## $ Team : Factor w/ 1 level "LA Kings": 1 1 1 1 1 1 1 1 1 1 ...
## $ Date : Factor w/ 1032 levels "1/1/08","1/1/11",..: 277 279 291 161 173 189 203 209 222 233 ...
## $ Month : Factor w/ 8 levels "Apr","Dec","Feb",..: 7 7 7 7 7 7 7 7 7 7 ...
## $ Year : int 2005 2005 2005 2005 2005 2005 2005 2005 2005 2005 ...
## $ Venue_Home : int 0 1 1 1 1 1 0 0 1 1 ...
## $ Venue_Away : int 1 0 0 0 0 0 1 1 0 0 ...
## $ Against_Team : Factor w/ 31 levels "ANA","ARI","ATL",..: 11 2 15 13 12 7 10 11 8 1 ...
## $ Against_Division: Factor w/ 4 levels "Atlantic","Central",..: 2 4 2 4 1 3 2 2 4 4 ...
## $ Win : int 0 1 1 1 0 1 1 1 0 1 ...
## $ Loss : int 1 0 0 0 1 0 0 0 1 0 ...
## $ OT_Loss : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Points : int 0 2 2 2 0 2 2 2 0 2 ...
## $ Goals_For : int 4 3 2 3 2 3 5 7 2 3 ...
## $ Goals_Against : int 5 2 1 1 5 1 4 2 3 1 ...
## $ SO_Win : int 0 0 0 0 0 0 0 0 0 0 ...
## $ SO_Loss : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Shots_For : int 20 28 32 35 28 31 30 28 20 33 ...
## $ Shots_Against : int 30 29 25 24 21 20 21 30 31 30 ...
## $ PP_Oppor : int 4 4 6 10 7 9 3 4 10 8 ...
## $ PP_Goals_For : int 1 2 0 2 0 2 0 1 1 1 ...
## $ PP_Goals_Against: int 2 2 0 0 0 0 2 2 2 0 ...
## $ PP_Percentage : num 25 50 0 20 0 22.2 0 25 10 12.5 ...
## $ PK_Percn : num 83.3 60 100 100 100 100 0 66.7 80 100 ...
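A structure listing like the one above is what `str()` prints for a data frame read from a CSV file. The following is only a minimal sketch: the file, its rows, and the name `kings` are made up for illustration and are not the real export.

```r
# Toy stand-in for the real export: the file and its rows are made up.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Team,Date,Win,Loss,Goals_For,Goals_Against",
  "LA Kings,10/5/05,0,1,4,5",
  "LA Kings,10/6/05,1,0,3,2"
), csv_path)

# stringsAsFactors = TRUE turns text columns into factors,
# which matches the "Factor w/ ..." entries in the listing.
kings <- read.csv(csv_path, stringsAsFactors = TRUE)

# Print the column names, types, and first few values.
str(kings)
```

Note that `Date` comes in as a factor rather than a real date here, so it needs converting before any date arithmetic.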
Let me describe the variables:
Team: LA Kings
Date: Date of the game
Month: Month of the game
Year: Year of the game
Venue_Home: 1 for true, 0 for false
Venue_Away: 1 for true, 0 for false
Against_Team: Opponent
Against_Division: Division of opponent
Win: 1 for win, 0 for loss
Loss: 1 for loss (including overtime losses), 0 for win
OT_Loss: Overtime loss
Points: Points
Goals_For: Goals scored
Goals_Against: Goals scored against
SO_Win: Shootout win
SO_Loss: Shootout loss
Shots_For: Total shots for
Shots_Against: Total shots against
PP_Oppor: Powerplay opportunities
PP_Goals_For: Goals scored in powerplay
PP_Goals_Against: Goals against in powerplay
PP_Percentage: Powerplay scoring percentage
PK_Percn: Penalty kill percentage
I made some changes to the dataset to make it a little tidier. I created separate columns for month and year from the date column. I merged the overtime losses into the Loss column, which gives me the full number of losses in one column. I also split the venue into separate home and away columns. Since division information was missing, I added a column with the opponent’s division. All these changes make working with this dataset a lot easier.
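The tidying steps described above could look roughly like this in base R. It is a sketch under assumptions, not the exact code used: the raw column `Home_Road`, the toy rows, and the `division_of` lookup are hypothetical.

```r
# Toy raw rows: made up for illustration, not the real download.
raw <- data.frame(
  Date = c("10/5/05", "10/6/05", "1/19/13"),
  Home_Road = c("R", "H", "H"),            # hypothetical raw venue column
  Win = c(0L, 1L, 0L),
  Loss = c(1L, 0L, 0L),
  OT_Loss = c(0L, 0L, 1L),
  Against_Team = c("DAL", "ARI", "CHI"),
  stringsAsFactors = FALSE
)

# 1. Month and Year from the date column (month/day/two-digit year).
game_date <- as.Date(raw$Date, format = "%m/%d/%y")
raw$Month <- month.abb[as.integer(format(game_date, "%m"))]
raw$Year <- as.integer(format(game_date, "%Y"))

# 2. Merge overtime losses into Loss so every non-win counts as a loss.
raw$Loss <- raw$Loss + raw$OT_Loss

# 3. Separate 0/1 indicator columns for home and away.
raw$Venue_Home <- as.integer(raw$Home_Road == "H")
raw$Venue_Away <- 1L - raw$Venue_Home

# 4. Division of the opponent via a lookup (only three teams shown here).
division_of <- c(DAL = "Central", ARI = "Pacific", CHI = "Central")
raw$Against_Division <- unname(division_of[raw$Against_Team])

print(raw)
```

Using `month.abb` instead of `format(game_date, "%b")` keeps the month labels in English regardless of the machine’s locale.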
Let’s start working on the data.
## Total games played: 1032
## Total wins: 512
## Total losses: 520
## Total home games: 516
## Total road games: 516
## Total points: 1141
## Total goals for: 2728
## Total goals against: 2686
## Total powerplay opportunities: 4011
## Total powerplay goals for: 716
## Summary of powerplay success rate:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 0.00 12.50 17.81 33.30 100.00
## Summary of penalties killed:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 66.70 100.00 81.78 100.00 100.00
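Totals like the ones above are column sums, and the two summaries are what `summary()` returns for a numeric column. Here is a minimal sketch on four toy games (the data frame `games` is made up). It also shows that the mean of per-game percentages is not the same thing as the pooled powerplay rate.

```r
# Four toy games, for illustration only.
games <- data.frame(
  Win = c(0L, 1L, 1L, 0L),
  Loss = c(1L, 0L, 0L, 1L),
  Venue_Home = c(0L, 1L, 1L, 0L),
  Points = c(0L, 2L, 2L, 1L),
  PP_Oppor = c(4L, 4L, 6L, 10L),
  PP_Goals_For = c(1L, 2L, 0L, 2L)
)

# Per-game powerplay percentage (assumes every game has at least one opportunity).
games$PP_Percentage <- 100 * games$PP_Goals_For / games$PP_Oppor

cat("Total games played:", nrow(games), "\n")
cat("Total wins:", sum(games$Win), "\n")
cat("Total losses:", sum(games$Loss), "\n")
cat("Total home games:", sum(games$Venue_Home), "\n")
cat("Total points:", sum(games$Points), "\n")

# Mean of the per-game percentages (what summary() reports as Mean).
print(summary(games$PP_Percentage))

# Pooled rate: all powerplay goals divided by all opportunities.
pooled_pp <- 100 * sum(games$PP_Goals_For) / sum(games$PP_Oppor)
cat("Pooled powerplay rate:", round(pooled_pp, 2), "\n")
```

With the totals reported above, the pooled rate is 716 / 4011, or about 17.85%, which is close to the per-game mean of 17.81.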
First, I would like to plot games played per year.
The highest number of games was played in 2013, followed by 2009. One thing that surprised me is that the Kings won the Stanley Cup in 2012, yet the chart shows only 43 games in that year. After some Google searching, I found out that the 2012-13 season was delayed by a lockout, a labor dispute between the NHL and the players’ association. The first game of the 2012-13 season was played on Jan 19, 2013, which means not a single game of that season was played in 2012; the 43 games shown for 2012 all belong to the second half of the 2011-12 season.
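The counts behind a games-per-year chart can be obtained with `table()`. This is a small sketch on made-up rows, assuming a `Year` column like the one in the dataset.

```r
# Toy rows only: five made-up games, not the real schedule.
games <- data.frame(Year = c(2012L, 2012L, 2013L, 2013L, 2013L))

# Count how many games fall in each calendar year.
games_per_year <- table(games$Year)
print(games_per_year)
```

Passing `games_per_year` to `barplot()` would draw the bar chart. Because the grouping is by calendar year, each NHL season is split across two bars.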
Let’s plot a line graph for this data.
Looks like there is no pattern in this data. Moving on to the next exploration.
Let’s plot points per year.
Wow! The Kings crossed the 100-point mark only twice between 2005 and 2018, and both times they won the Stanley Cup. So crossing the 100-point mark is a good indication that the team will do well in the play-offs. There is also another indication of poor performance in the 2017 season and a bad start to 2018. I would like to see the same data in a line chart to see the ups and downs.
Let’s see what the line chart looks like for points per year.
There is a big drop in 2012; as noted above, it was due to the late start of the 2012-13 season.
Next, I would like to plot win totals over the years.
There are so many details in this chart. The year 2013 was the most successful with 52 wins, which led to the Kings winning the 2013-14 Stanley Cup. The year 2018 looks very strange; it tells the story of the Kings’ bad performance in the 2017 season and poor start to 2018. In this chart, we can also see that the Kings have been getting better since 2010. To see it more clearly, let’s plot the line chart.
This chart gives us a slightly clearer picture. As you can see, the Kings’ performance got better after 2009, but 2012 and 2018 have been bad years. The pattern suggests that the Kings bounce back every 5-6 years.
So, most wins are against the Arizona Coyotes (35 wins). The second best record is against the Dallas Stars (34 wins), followed by the Edmonton Oilers and Anaheim Ducks with 33 wins against each team. I am a little surprised, but very happy to see the results against the Ducks. I also noticed the ATL team, which I didn’t recognize. A Google search revealed that it was the Atlanta Thrashers; this team has been known as the Winnipeg Jets since 2011.
I am a little disappointed to see the results against their 2 big rivals, the San Jose Sharks and Anaheim Ducks. The Sharks and Ducks beat the Kings 46 and 44 times respectively. The Kings also lost to the Coyotes 42 times. The worst part of this picture is that all 3 of these teams are in the same division as the Kings. I will ignore the results against Vegas (VGK) since Vegas has only played one season.
Next, I would like to plot wins by home and road games. First, I will prepare a data frame that makes it easy for me to plot graphs.
## # A tibble: 6 x 4
## Year Venue_Home Venue_Away Win
## <int> <int> <int> <int>
## 1 2005 13 12 25
## 2 2006 22 9 31
## 3 2007 15 12 27
## 4 2008 19 14 33
## 5 2009 18 24 42
## 6 2010 25 20 45
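The summary table above has one row per year with home wins, road wins, and total wins. One way to build such a table is to keep only the games that were won and sum the indicator columns by year. The sketch below uses base R `aggregate()` on toy rows; the original table was printed as a tibble, so the actual code may well have used dplyr instead.

```r
# Toy games, for illustration only.
games <- data.frame(
  Year = c(2005L, 2005L, 2005L, 2006L, 2006L),
  Venue_Home = c(1L, 0L, 1L, 1L, 0L),
  Venue_Away = c(0L, 1L, 0L, 0L, 1L),
  Win = c(1L, 1L, 0L, 1L, 0L)
)

# Keep only the wins, then add up the 0/1 columns within each year.
wins_only <- games[games$Win == 1L, ]
wins_by_year <- aggregate(
  cbind(Venue_Home, Venue_Away, Win) ~ Year,
  data = wins_only,
  FUN = sum
)
print(wins_by_year)
```

In each row the home and road wins add up to the `Win` total. Filtering on `Loss == 1` instead gives the matching table for losses.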
Looks good, let’s plot the bar graph.
Most home wins came in 2013 (32 wins), followed by 2016 (26 wins) and 2010 (25 wins). Some years were really bad, with under 20 wins. I want to plot a line to identify patterns.
I don’t see a clear pattern in this data. Looks like the Kings had good years and bad years.
Next, I would like to plot wins on the road.
Looks like the Kings have been consistent on the road since 2009, except for a couple of years. They performed well on the road in 2011 and 2013, which helped them win the Stanley Cup in 2012 and 2014. This is an indication that road wins play a big role in winning the Cup.
OK, let’s take a look at losses at home now. Once again, I will prepare the data frame for easier plotting.
## # A tibble: 6 x 4
## Year Venue_Home Venue_Away Loss
## <int> <int> <int> <int>
## 1 2005 6 10 16
## 2 2006 23 28 51
## 3 2007 24 30 54
## 4 2008 25 21 46
## 5 2009 16 28 44
## 6 2010 16 17 33
Nice. Let’s plot the bar graph.
2006 to 2008 were some rough years; the Kings lost more than 20 home games in each of them. The Kings performed well at home from 2012 to 2016. 2005 was the best year for the home record. Next, I would like to plot losses on the road.
The Kings lost 20+ games on the road in most years. This is something they should work on. The 2012 record looks great; I remember the Kings were known as “Kings of the road” in 2012, and that performance led to winning the Stanley Cup championship.
The Kings played the highest number of games in 2013 and the lowest number in 2005. They won 512 games and lost 520, and earned 1141 points in total between 2005 and 2018. They also earned the highest number of points in 2013, which goes with the most wins in the same year. The Kings did not perform well in 2017, followed by a poor start to 2018. The Kings beat the Coyotes 35 times and the Stars 34 times, followed by 33 wins each against the Oilers and Ducks. The Sharks beat the Kings 46 times, and the Kings lost to the Ducks 44 times. This hurts badly since both the Sharks and Ducks are in the same division as the Kings.
As we can see in the plot, the Kings lost more games than they won, which is a negative overall performance. We lost more than 50% of our games in those 13 seasons. This is not good.
I wonder how the records stand against each team. A comparison plot can give us answers.
Once again, the records against the Sharks and Ducks look very bad. These 2 teams are in the same division as the Kings, which makes it even more deadly. Other than that, the Kings are neck and neck with the other teams.
Next, I would like to compare home vs away records.
Home/away records look balanced for most years, except for a couple of seasons. As we know, the Kings played an equal number of games at home and on the road. Let’s plot a line chart for this data.
One thing that stood out to me is the big difference in 2009: the Kings played more games on the road. Also, the Kings did not qualify for the play-offs in that season. Did more road games play a role in it? I believe it is certainly possible. I would like to explore the win/loss record based on the venue.
First, I will check the total wins at home and on the road.
## Total wins at home: 285
## Total wins on the road: 227
Nice, just as expected, more wins at home. Now, let’s plot the line graph.
Home dominance is clearly visible from 2005 to 2018. There were only 2 years in which they won more games on the road. In hockey, winning away games is crucial.
Let’s do similar analysis for losses.
## Total losses at home: 231
## Total losses on the road: 289
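The four venue totals printed in this section can be cross-checked against the overall totals reported earlier (512 wins, 520 losses, 516 home games, 516 road games). The sketch below uses only those reported numbers; the win shares at the end are a small addition for comparison and were not part of the original output.

```r
# Totals as reported in this analysis.
home_wins <- 285
road_wins <- 227
home_losses <- 231
road_losses <- 289

# Wins and losses split by venue must add up to the overall totals.
stopifnot(home_wins + road_wins == 512)
stopifnot(home_losses + road_losses == 520)

# Each venue must account for 516 games.
stopifnot(home_wins + home_losses == 516)
stopifnot(road_wins + road_losses == 516)

# Share of games won at each venue.
home_win_share <- home_wins / (home_wins + home_losses)
road_win_share <- road_wins / (road_wins + road_losses)
cat("Home win share:", round(100 * home_win_share, 1), "%\n")
cat("Road win share:", round(100 * road_win_share, 1), "%\n")
```

All four checks pass, so the venue breakdown is consistent with the headline totals. The shares show the home advantage as a rate rather than a raw count.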
This is another indication that the Kings have to work on their road record. I would like to plot the line graph to see yearly road performance.
OK, so the Kings consistently lost more road games over the years. This is a huge area for improvement.
After comparing multiple variables, I found some more details about the Kings’ performance between 2005 and 2018. The Kings lost more games than they won. Performance against the Sharks (30 wins, 46 losses) and Ducks (33 wins, 44 losses) was disappointing. The Kings had good records against the Oilers, Bruins, Blue Jackets, and Avalanche. The Kings had their best road record in 2009.
In the next section, I would like to analyze the records against each division.
I would like to plot a separate bar chart for each division to see how they look next to each other.
OK, so as we can see in the charts, the Kings have the best record against the Pacific division. The Kings’ win totals shrink as they move away from the Pacific Ocean towards the Atlantic Ocean.
Let’s look at the Kings’ loss record in similar fashion.
Wow! The bars are once again high against the Pacific division. They lost a lot of games against the Pacific division. It is clear that for the Kings to be a successful team, they have to conquer the Pacific division.
The Kings play a lot of games against Pacific division teams because the Kings are in the Pacific division. Their record does not seem to be that great against the Pacific division. They must win against this division if they are dreaming of another Stanley Cup.
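Because the number of games differs so much between divisions, raw win and loss counts are hard to compare directly. One possible way to put the divisions on the same scale is to compute the share of games won against each. This is a sketch on toy rows, and the `Win_Share` column is an addition that the charts in this analysis do not show.

```r
# Toy games, for illustration only.
games <- data.frame(
  Against_Division = c("Pacific", "Pacific", "Pacific",
                       "Central", "Central", "Atlantic"),
  Win = c(1L, 0L, 1L, 0L, 1L, 1L),
  Loss = c(0L, 1L, 0L, 1L, 0L, 0L)
)

# Sum wins and losses within each division.
by_division <- aggregate(
  cbind(Win, Loss) ~ Against_Division,
  data = games,
  FUN = sum
)

# Games played and share of games won against each division.
by_division$Games <- by_division$Win + by_division$Loss
by_division$Win_Share <- by_division$Win / by_division$Games
print(by_division)
```

A division with both the most wins and the most losses can still have an ordinary win share, which is what the counts alone cannot show.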
The first chart I picked is the Kings’ loss record against each team.
The reason I picked this chart is that the Kings must find a way to beat 3 teams: the Sharks, Ducks, and Coyotes. They are losing most of their games against these 3 teams.
The second chart I picked is about the Kings’ losses on the road.
I think this chart is also very important. The Kings are losing way too many games on the road. They have to work on their road record and figure out how to win away from home.
The reason I picked the third chart is that the Kings had a very poor record against the Pacific division. Their record against the Central division is not that great either. I think this weakness should be a top priority for the Kings.
I have been watching Kings hockey since the 2005-06 season. After working on this project, I found a lot more details that I was not aware of. I certainly learned a lot about hockey data while working on this project. From now on, I will look at the hockey stats on TV during games from a different perspective.
First of all, it was kind of challenging to find the data I wanted. I finally came across the NHL.com/stats website, where I was able to gather the data. The data was a little challenging to work with. I had to do some data wrangling to format the data the way I wanted. For example, I separated the home/away venue, split the year into a different column, added division data, merged overtime losses into the Loss column, etc. One big piece missing from this data was seasons. If I had more time, I would have added season information to this data so I could also analyze records by season.
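A season label could be derived from the game date. The sketch below is one possible approach, not something that was done in this analysis: the helper `season_of` is hypothetical, and it assumes that games from September to December belong to the season starting that year, while games from January to August belong to the season that started the year before.

```r
# Hypothetical helper: map a game date to a label such as "2012-13".
# Assumption: months 9-12 start a season, months 1-8 finish the previous one.
season_of <- function(game_date) {
  year <- as.integer(format(game_date, "%Y"))
  month <- as.integer(format(game_date, "%m"))
  start_year <- ifelse(month >= 9L, year, year - 1L)
  sprintf("%d-%02d", start_year, (start_year + 1L) %% 100L)
}

# Example dates: spring 2012, the delayed 2012-13 opener, autumn 2013.
dates <- as.Date(c("2012-04-07", "2013-01-19", "2013-10-03"))
print(season_of(dates))
```

With this rule the delayed opener on Jan 19, 2013 lands in 2012-13, and the games played in early 2012 land in 2011-12. Grouping by this label instead of `Year` would keep each season in one piece.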
After my analysis, I would suggest that the Kings work on 3 weaknesses: they have to figure out a way to win more games against the Sharks and Ducks, they must win a lot more games on the road, and they have to improve their record against the Pacific and Central divisions. If they can fix these 3 issues, I am sure I will be visiting downtown soon to watch another victory parade.
Analysis By: Jatinder Dhandi
Fan: LA Kings